Abstract

Many financial data are now collected at an ultra-high frequency,
such as tick-by-tick. However, increasing the observation frequency
while keeping the time span of the observation fixed does not always
help in estimating parameters. A different type of consistency, the
consistency of an estimator as the observation frequency goes to
infinity, becomes important in studying high frequency data. In
addition to consistency, the deviation of a financial time series from
a continuous process also becomes increasingly significant as the
observation frequency increases. This deviation is not negligible and
causes another difficulty in estimating parameters. This paper
concentrates on constructing estimators of the variance parameter using
contaminated observations; i.e., observations from a continuous process
with a deviation at each time of observation. The consistency of these
estimators, as the observation frequency goes to infinity, is analyzed.


Key  Words:  f-consistency;  observation  noise;  quadratic  estimator. 


1      Introduction 

I start with an example of estimating the mean parameter $\mu$ in a simple
process $dB(t) = \mu\,dt + \sigma\,dW(t)$. For a fixed span of the observation
interval and $\sigma = 1$, does increasing the observation frequency help the
estimation? This type of question has come up recently in studying high
frequency financial time series. We now accumulate more and more financial
data not because time goes faster, but because data are recorded more
frequently. We have gone from quarterly data to monthly data to weekly,
daily and now tick-by-tick data. How does more data help us in estimating
parameters? It may surprise many people that increasing the observation
frequency while keeping the span of observation fixed does not always help
in estimating a parameter. In the case of estimating the mean parameter,
increasing the observation frequency does not help the estimation at all.
When $\sigma$ is given, the minimal sufficient statistic for the mean
parameter is the difference of the two end observation points, and this
difference does not change as the observation frequency increases. However,
when the variance parameter is of interest, increasing the observation
frequency does help the estimation. The quadratic variation is a consistent
estimator of the variance parameter as the observation frequency goes to
infinity. This raises a new consistency problem, f-consistency: consistency
as the observation frequency goes to infinity while the time span is kept
fixed.
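The contrast between the two parameters can be illustrated with a small simulation. The following Python sketch is an illustration added here under stated assumptions: the function names, the seed, and the values $\mu = 0.5$, $\sigma = 1$ are hypothetical choices, not taken from the original analysis. It observes one simulated path at two frequencies and shows that the end-point estimate of $\mu$ is identical at both, while the quadratic variation is computed from many more increments at the higher frequency.

```python
import numpy as np

def simulate_path(n, mu=0.5, sigma=1.0, seed=0):
    """Simulate B(t) at n + 1 equally spaced points of [0, 1]."""
    rng = np.random.default_rng(seed)
    increments = mu / n + sigma * np.sqrt(1.0 / n) * rng.standard_normal(n)
    return np.concatenate(([0.0], np.cumsum(increments)))

def endpoint_mean_estimate(path):
    """Estimate mu over a unit time span from the two end points only."""
    return path[-1] - path[0]

def quadratic_variation(path):
    """Sum of squared increments, an estimate of sigma^2 over [0, 1]."""
    return float(np.sum(np.diff(path) ** 2))

fine = simulate_path(n=10_000)
coarse = fine[::100]          # the same path observed 100 times less often

mean_fine = endpoint_mean_estimate(fine)
mean_coarse = endpoint_mean_estimate(coarse)
qv_fine = quadratic_variation(fine)
qv_coarse = quadratic_variation(coarse)
print(mean_fine, mean_coarse, qv_fine, qv_coarse)
```

The two mean estimates coincide because both use only the first and last observation; only the variance estimate can benefit from the extra observations.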

There is another issue associated with using high frequency data. When
the observation frequency increases, the difference of financial data from a
continuous process becomes increasingly significant. For example, as the
observation frequency increases, the variance of the price increment does
not approach zero, and the first order autocorrelation of the increments is
strongly negative. We often neglect such a difference in using low
frequency data such as daily or monthly prices. This difference is not
negligible in high frequency data. However, this should not keep us from
using a continuous process for high frequency data. I suggested (Zhou 1991)
that high frequency financial data can be viewed as observations from a
diffusion process with observation noise:

\[ S(t) = P(t) + \epsilon_t, \quad t \in [a,b], \tag{1} \]

where $P(t)$ is a diffusion process

\[ dP(t) = \mu(t)\,dt + \sigma(t)\,dW_t. \tag{2} \]


I call the diffusion process the signal process and $\epsilon_t$ the
observation noise. The observation noise is the deviation of the data from
the continuous process and is assumed to be independent of the diffusion
process. Many things contribute to this observation noise. In the currency
market, for example, non-binding quoting error is part of the noise. In
other markets, the difference between bid and offer also contributes to the
observation noise. Many other micro-structural behaviors are all included in
this so-called observation noise. For low frequency observations, the
observation noise is overwhelmed by the signal change. When the observation
frequency increases, the signal change becomes smaller and smaller while the
size of the noise remains the same. The noise totally dominates the price
change in ultra-high frequency data. Viewing high frequency data as
observations with noise captures the basic characteristics of high
frequency financial time series mentioned above.
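The two stylized facts mentioned above (increment variance that does not vanish, and strongly negative first order autocorrelation) follow directly from model (1). The sketch below is a minimal illustration, assuming a driftless signal with constant variance and i.i.d. Gaussian noise; the function names, seed, and parameter values are hypothetical. Under these assumptions the increment variance is $\sigma^2/n + 2\eta^2$ and the lag-one autocovariance is $-\eta^2$, so the autocorrelation tends to $-1/2$ as $n$ grows.

```python
import numpy as np

def simulate_noisy_increments(n, sigma2=1.0, eta2=0.01, seed=0):
    """Increments of S = P + eps on [0, 1] with n + 1 equally spaced points.

    P is a driftless Brownian motion with variance sigma2 over [0, 1];
    eps is i.i.d. Gaussian noise with variance eta2 (an assumption here).
    """
    rng = np.random.default_rng(seed)
    steps = np.sqrt(sigma2 / n) * rng.standard_normal(n)
    signal = np.concatenate(([0.0], np.cumsum(steps)))
    noise = np.sqrt(eta2) * rng.standard_normal(n + 1)
    return np.diff(signal + noise)

def lag1_autocorrelation(x):
    """Sample first-order autocorrelation of a series."""
    x = x - x.mean()
    return float(np.sum(x[1:] * x[:-1]) / np.sum(x * x))

for n in (10, 100, 10_000):
    x = simulate_noisy_increments(n)
    print(n, x.var(), lag1_autocorrelation(x))

x_high = simulate_noisy_increments(10_000)
```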

In this paper, I concentrate on constructing estimators of the variance
parameter using noisy high frequency observations. The f-consistency is
investigated for each estimator. Without loss of generality, I assume that
the time span considered here is $[0,1]$, which can be an hour or a month.
The parameter to be estimated is

\[ \sigma^2 = \int_0^1 \sigma^2(t)\,dt = \mathrm{Var}(P(1) - P(0)). \tag{3} \]

This paper is organized as follows. In Section 2, I study the f-consistency
of the maximum likelihood estimator under the assumption of Gaussian noise
and a constant variance parameter. In Section 3, I explore an estimator
obtained by the method of moments under more relaxed assumptions. In
Section 4, I construct an optimal quadratic estimator. In Section 5, I
investigate the sensitivity of each estimator to its assumptions and give an
overall evaluation of each estimator.


2      F-consistency  of  The  Maximum  Likelihood 
Estimator 

In this section, I assume that the process (1) has independent Gaussian
noise with constant variance and the signal process (2) has $\mu(t) = \mu$,
$\sigma(t) = \sigma$. Under these assumptions, I can obtain a maximum
likelihood estimator (MLE). Solving the differential equation (2), I have

\[ S(t) = \mu t + \sigma W(t) + \epsilon_t, \quad t \in [0,1], \tag{4} \]

where $W(t)$ is a standard Wiener process and the $\epsilon_t$ are
independent Gaussian random variables with mean zero and variance $\eta^2$.
Taking $n+1$ equally spaced observations from $[0,1]$, I have
$\{S_{0,n}, S_{1,n}, \ldots, S_{n,n}\}$ such that

\[ X_{i,n} = S_{i,n} - S_{i-1,n} = \frac{\mu}{n} + \frac{\sigma}{\sqrt{n}} Z_i + \epsilon_i - \epsilon_{i-1}, \tag{5} \]

where $Z_i$ is a standard Gaussian random variable. The joint distribution
of $\{X_{1,n}, \ldots, X_{n,n}\}$ is a multivariate normal distribution with
mean $\mu/n$ in every component and variance matrix

\[ \Sigma_n = \frac{\sigma^2}{n} I_n + \eta^2 A_n, \tag{6} \]

where $I_n$ is an identity matrix and $A_n$ is the $n \times n$ tridiagonal
matrix

\[ A_n = \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{pmatrix}. \tag{7} \]

The eigenvalues and eigenvectors of this matrix are known (Gregory and
Karney, 1969):

\[ \lambda_i = 4\sin^2\!\left(\frac{i\pi}{2(n+1)}\right) \tag{8} \]

and

\[ v_i = \left(\frac{2}{n+1}\right)^{1/2} \begin{pmatrix} \sin(i\pi/(n+1)) \\ \sin(2i\pi/(n+1)) \\ \vdots \\ \sin(ni\pi/(n+1)) \end{pmatrix}. \tag{9} \]
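The closed-form eigen-system can be checked numerically. The sketch below is a small verification added for illustration; it assumes that the matrix $A_n$ in the covariance (6) is the tridiagonal matrix with 2 on the diagonal and $-1$ on the two adjacent diagonals, which is the covariance structure of the differences $\epsilon_i - \epsilon_{i-1}$ for unit-variance noise. The function names are hypothetical.

```python
import numpy as np

def noise_difference_matrix(n):
    """Tridiagonal matrix with 2 on the diagonal and -1 next to it."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def eigen_system(n):
    """Closed-form eigenvalues and eigenvectors of the tridiagonal matrix."""
    i = np.arange(1, n + 1)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * (n + 1))) ** 2
    # Column i of V is the eigenvector v_i; V is symmetric and orthogonal.
    V = np.sqrt(2.0 / (n + 1)) * np.sin(np.outer(i, i) * np.pi / (n + 1))
    return lam, V

n = 8
A = noise_difference_matrix(n)
lam, V = eigen_system(n)
print(np.allclose(V @ np.diag(lam) @ V.T, A))
```

Because the eigen-system is known in closed form, the covariance matrix never has to be inverted numerically; every quantity in the likelihood reduces to a sum over $i$.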


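The log-likelihood function of $X$ is

\[ l(\mu,\sigma^2,\eta^2;X) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma_n| - \frac{1}{2}\left(X - \frac{\mu}{n}e\right)^T \Sigma_n^{-1} \left(X - \frac{\mu}{n}e\right). \tag{10} \]

The derivatives of the log-likelihood function with respect to each
parameter are

\[ \frac{\partial l(\mu,\sigma^2,\eta^2;X)}{\partial \mu} = \left(X - \frac{\mu}{n}e\right)^T \Sigma_n^{-1} \frac{e}{n}, \tag{11} \]

\[ \frac{\partial l(\mu,\sigma^2,\eta^2;X)}{\partial \sigma^2} = -\frac{1}{2n}\mathrm{tr}(\Sigma_n^{-1}) + \frac{1}{2n}\left(X - \frac{\mu}{n}e\right)^T \Sigma_n^{-1}\Sigma_n^{-1} \left(X - \frac{\mu}{n}e\right), \tag{12} \]

\[ \frac{\partial l(\mu,\sigma^2,\eta^2;X)}{\partial \eta^2} = -\frac{1}{2}\mathrm{tr}(\Sigma_n^{-1}A_n) + \frac{1}{2}\left(X - \frac{\mu}{n}e\right)^T \Sigma_n^{-1}A_n\Sigma_n^{-1} \left(X - \frac{\mu}{n}e\right). \tag{13} \]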

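Rewrite matrix $A_n$ as

\[ A_n = V_n \Lambda_n V_n^T, \tag{14} \]

where $e$ is a vector with all elements 1, $V_n = (v_1, \ldots, v_n)$ is the
matrix consisting of the eigenvectors, and $\Lambda_n = \mathrm{diag}(\lambda_i)$
is a diagonal matrix whose diagonal elements are the eigenvalues, as defined
in (8) and (9). The inverse of the covariance matrix is

\[ \Sigma_n^{-1} = V_n\,\mathrm{diag}\!\left(\frac{1}{\sigma^2/n + \eta^2\lambda_i}\right) V_n^T. \]

Let

\[ (Y_1, \ldots, Y_n)^T = V_n^T X. \]

Let $v_{\cdot i} = \sum_j v_{ij}$ and notice that $\sum_j v_{ij}^2 = 1$. The MLE
of $\mu$, $\sigma^2$ and $\eta^2$ is the solution of the equations:

\[ \sum_i \frac{v_{\cdot i} Y_i}{\sigma^2/n + \eta^2\lambda_i} = \frac{\mu}{n} \sum_i \frac{v_{\cdot i}^2}{\sigma^2/n + \eta^2\lambda_i}, \tag{15} \]

\[ \sum_i \frac{1}{2n(\sigma^2/n + \eta^2\lambda_i)} = \sum_i \frac{(Y_i - \mu v_{\cdot i}/n)^2}{2n(\sigma^2/n + \eta^2\lambda_i)^2}, \tag{16} \]

\[ \sum_i \frac{\lambda_i}{2(\sigma^2/n + \eta^2\lambda_i)} = \sum_i \frac{\lambda_i (Y_i - \mu v_{\cdot i}/n)^2}{2(\sigma^2/n + \eta^2\lambda_i)^2}. \tag{17} \]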

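The Fisher information matrix of $(\mu,\sigma^2,\eta^2)$ is

\[ I(\mu,\sigma^2,\eta^2) = \begin{pmatrix} \sum_{ij}[\Sigma_n^{-1}]_{ij}/n^2 & 0 & 0 \\ 0 & \frac{1}{2n^2}\mathrm{tr}(\Sigma_n^{-1}\Sigma_n^{-1}) & \frac{1}{2n}\mathrm{tr}(\Sigma_n^{-1}A_n\Sigma_n^{-1}) \\ 0 & \frac{1}{2n}\mathrm{tr}(\Sigma_n^{-1}A_n\Sigma_n^{-1}) & \frac{1}{2}\mathrm{tr}(\Sigma_n^{-1}A_n\Sigma_n^{-1}A_n) \end{pmatrix} \]

\[ = \begin{pmatrix} \sum_{ij}[\Sigma_n^{-1}]_{ij}/n^2 & 0 & 0 \\ 0 & \sum_i \frac{1}{2n^2(\sigma^2/n + \eta^2\lambda_i)^2} & \sum_i \frac{\lambda_i}{2n(\sigma^2/n + \eta^2\lambda_i)^2} \\ 0 & \sum_i \frac{\lambda_i}{2n(\sigma^2/n + \eta^2\lambda_i)^2} & \sum_i \frac{\lambda_i^2}{2(\sigma^2/n + \eta^2\lambda_i)^2} \end{pmatrix}. \]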


The MLE of the mean is

\[ \hat\mu_M = n \left( \sum_i \frac{v_{\cdot i} Y_i}{\sigma^2/n + \eta^2\lambda_i} \right) \Big/ \left( \sum_i \frac{v_{\cdot i}^2}{\sigma^2/n + \eta^2\lambda_i} \right). \tag{18} \]


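Using the scoring method, the variance parameter and the variance of the
noise can be solved by the iteration

\[ \begin{pmatrix} \sigma_M^{2,(k)} \\ \eta_M^{2,(k)} \end{pmatrix} = \begin{pmatrix} \sigma_M^{2,(k-1)} \\ \eta_M^{2,(k-1)} \end{pmatrix} + I\!\left(\sigma_M^{2,(k-1)}, \eta_M^{2,(k-1)}\right)^{-1} d^{(k-1)}, \tag{19} \]

where

\[ d^{(k-1)} = \begin{pmatrix} \sum_i \left[ -\dfrac{1}{2n(\sigma^{2,(k-1)}/n + \eta^{2,(k-1)}\lambda_i)} + \dfrac{(Y_i - \mu v_{\cdot i}/n)^2}{2n(\sigma^{2,(k-1)}/n + \eta^{2,(k-1)}\lambda_i)^2} \right] \\ \sum_i \left[ -\dfrac{\lambda_i}{2(\sigma^{2,(k-1)}/n + \eta^{2,(k-1)}\lambda_i)} + \dfrac{\lambda_i (Y_i - \mu v_{\cdot i}/n)^2}{2(\sigma^{2,(k-1)}/n + \eta^{2,(k-1)}\lambda_i)^2} \right] \end{pmatrix} \]

and $I(\sigma^2,\eta^2)$ is the lower-right corner sub-matrix of the
information matrix (18).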
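One scoring update can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used for the simulations: it assumes the tridiagonal noise matrix with closed-form eigen-system, centers every increment by $\mu/n$ (so the rotated, centered data equal $Y_i - \mu v_{\cdot i}/n$), and takes $\mu$ as given. All function names, the seed, and the starting values are hypothetical.

```python
import numpy as np

def eigen_system(n):
    i = np.arange(1, n + 1)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * (n + 1))) ** 2
    V = np.sqrt(2.0 / (n + 1)) * np.sin(np.outer(i, i) * np.pi / (n + 1))
    return lam, V

def log_likelihood(x, mu, sigma2, eta2):
    """Gaussian log-likelihood of the increments, in the eigenbasis."""
    n = len(x)
    lam, V = eigen_system(n)
    d = sigma2 / n + eta2 * lam            # eigenvalues of the covariance
    z = V @ (x - mu / n)                   # rotated, centered increments
    return float(-0.5 * n * np.log(2 * np.pi)
                 - 0.5 * np.sum(np.log(d)) - 0.5 * np.sum(z ** 2 / d))

def score(x, mu, sigma2, eta2):
    """Derivatives of the log-likelihood w.r.t. (sigma2, eta2)."""
    n = len(x)
    lam, V = eigen_system(n)
    d = sigma2 / n + eta2 * lam
    z = V @ (x - mu / n)
    w = (z ** 2 - d) / (2.0 * d ** 2)
    return np.array([np.sum(w) / n, np.sum(lam * w)])

def information(n, sigma2, eta2):
    """Lower-right 2 x 2 block of the Fisher information matrix."""
    lam, _ = eigen_system(n)
    d = sigma2 / n + eta2 * lam
    i22 = np.sum(1.0 / (2.0 * n ** 2 * d ** 2))
    i23 = np.sum(lam / (2.0 * n * d ** 2))
    i33 = np.sum(lam ** 2 / (2.0 * d ** 2))
    return np.array([[i22, i23], [i23, i33]])

def scoring_step(x, mu, sigma2, eta2):
    """One Fisher scoring update of (sigma2, eta2)."""
    step = np.linalg.solve(information(len(x), sigma2, eta2),
                           score(x, mu, sigma2, eta2))
    return sigma2 + step[0], eta2 + step[1]

rng = np.random.default_rng(1)
n, sigma2, eta2 = 200, 1.0, 0.01
eps = np.sqrt(eta2) * rng.standard_normal(n + 1)
x = np.sqrt(sigma2 / n) * rng.standard_normal(n) + np.diff(eps)
mu_hat = float(np.sum(x))                  # end-point estimator S_n - S_0
new_sigma2, new_eta2 = scoring_step(x, mu_hat, sigma2, eta2)
```

In practice the step would be repeated until the update is small, starting from a reasonable initial guess.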

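Theorem 1 The asymptotic behavior of the information matrix is

\[ I(\mu,\sigma^2,\eta^2) = \begin{pmatrix} \dfrac{1}{2\sigma^2\sqrt{\gamma n}} + o\!\left(\dfrac{1}{\sqrt{n}}\right) & 0 & 0 \\ 0 & \dfrac{\sqrt{n}}{8\sigma^4\sqrt{\gamma}} + o(\sqrt{n}) & \dfrac{\sqrt{n}}{8\sigma^4\sqrt{\gamma^3}} + o(\sqrt{n}) \\ 0 & \dfrac{\sqrt{n}}{8\sigma^4\sqrt{\gamma^3}} + o(\sqrt{n}) & \dfrac{n}{2\eta^4} + o(n) \end{pmatrix}, \tag{20} \]

where $\gamma = \eta^2/\sigma^2$.

The proof can be found in the Appendix.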
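The leading terms for the variance block can be compared with the exact sums numerically. The sketch below is an illustrative check, assuming the eigenvalues $\lambda_i = 4\sin^2(i\pi/(2(n+1)))$ and the leading terms $\sqrt{n}/(8\sigma^4\sqrt{\gamma})$, $\sqrt{n}/(8\sigma^4\gamma^{3/2})$ and $n/(2\eta^4)$ derived in the Appendix; the function names and parameter values are hypothetical.

```python
import numpy as np

def information_block(n, sigma2, eta2):
    """Exact (sigma2, eta2) block of the information matrix, from sums."""
    i = np.arange(1, n + 1)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * (n + 1))) ** 2
    d = sigma2 / n + eta2 * lam
    i22 = np.sum(1.0 / (2.0 * n ** 2 * d ** 2))
    i23 = np.sum(lam / (2.0 * n * d ** 2))
    i33 = np.sum(lam ** 2 / (2.0 * d ** 2))
    return np.array([[i22, i23], [i23, i33]])

def leading_terms(n, sigma2, eta2):
    """Leading-order terms of the same block for large n."""
    gamma = eta2 / sigma2
    i22 = np.sqrt(n) / (8.0 * sigma2 ** 2 * np.sqrt(gamma))
    i23 = np.sqrt(n) / (8.0 * sigma2 ** 2 * gamma ** 1.5)
    i33 = n / (2.0 * eta2 ** 2)
    return np.array([[i22, i23], [i23, i33]])

n, sigma2, eta2 = 10_000, 1.0, 0.01
ratio = information_block(n, sigma2, eta2) / leading_terms(n, sigma2, eta2)
print(ratio)   # entries approach 1 as n grows
```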
It is easy to prove that

\[ \mathrm{Var}(\hat\mu_M) = \left[ \sum_i \frac{1}{n^2(\sigma^2/n + \eta^2\lambda_i)} \right]^{-1} = O(\sqrt{n}). \]

The variance diverges as the observation frequency increases. It is worse
than the estimator $\hat\mu = S_{n,n} - S_{0,n}$, which has constant variance
for any observation frequency. Instead of the MLE of $\mu$,
$\hat\mu = S_{n,n} - S_{0,n}$ is used in this section as the estimator of the
mean parameter $\mu$. The classical asymptotic results about the MLE do not
apply here because it is the observation frequency, rather than the time
span, that goes to infinity. To investigate the f-consistency of the
estimator, I conduct a series of simulations using different values of
$\sigma^2$ and of the signal-to-noise ratio $\gamma = \eta^2/\sigma^2$. For
each observation frequency $n$, I simulate 100 series of noisy observations
as in process (4). Then I calculate the 100 MLEs and their sample mean and
sample variance. The results are given in Table 1. The empirical results
indicate that the MLE is f-consistent and that the convergence rate of the
variance of the MLE is similar to the inverse of the information matrix
(20). That is,

\[ \mathrm{Var}(\hat\sigma_M^2) = \frac{8\sigma^4\sqrt{\gamma}}{\sqrt{n}} + o\!\left(\frac{1}{\sqrt{n}}\right), \tag{21} \]

\[ \mathrm{Var}(\hat\eta_M^2) = \frac{2\eta^4}{n} + o\!\left(\frac{1}{n}\right). \tag{22} \]

When there is no observation noise, the MLE of the variance parameter is the
quadratic variation $\hat\sigma^2 = \sum X_{i,n}^2 - (\sum X_{i,n})^2/n$. The
variance of the quadratic variation estimator is of order $\sigma^4/n$. For
both the mean and the variance parameter, the variances of the MLEs converge
slower by a factor of $\sqrt{n}$ when there is observation noise.

Because the eigenvalues of matrix $A_n$ are known, it is not too expensive
to compute the MLE. However, there are several drawbacks to this estimator.
First, the variances of high frequency financial data are extremely unevenly
distributed among all observations; i.e., the assumption $\sigma_t = \sigma$
is often violated. Second, the noise in (4) is often not normally
distributed and may be dependent. Third, the iteration (19) needs a
reasonable initial guess. In the next section, I look for an estimator by
the method of moments under more relaxed assumptions.


Table 1: Empirical Mean and Variance of MLE

                          n = 100               n = 500               n = 1000
gamma   parameter     mean      variance     mean      variance     mean      variance

sigma^2 = 1
0.1     sigma^2       0.953     0.365        1.04435   0.131        1.02014   0.0761
        eta^2         0.099     2.37e-4      0.10035   4.77e-5      0.10070   2.24e-5
0.01    sigma^2       0.976     0.092        0.99370   0.049        0.98798   0.0257
        eta^2         0.0104    6.96e-6      0.01000   7.23e-7      0.01006   3.34e-7
0.001   sigma^2       0.952     0.062        1.00262   0.017        1.00383   0.0094
        eta^2         0.00130   1.46e-6      0.00101   3.42e-8      0.00100   8.42e-9

sigma^2 = 10
0.1     sigma^2       9.025     22.930       9.722     13.4747      9.793     7.1973
        eta^2         1.036     0.0326       0.995     4.92e-3      1.003     0.0020
0.01    sigma^2       10.625    10.002       10.022    4.2879       9.625     3.2054
        eta^2         0.095     4.91e-4      0.101     5.56e-5      0.101     2.28e-5
0.001   sigma^2       10.371    8.5926       9.973     1.4881       9.846     1.0128
        eta^2         0.0084    1.71e-4      0.0100    2.85e-6      0.0101    7.01e-7

For different parameters and 100 replications, this table gives the
empirical mean and variance of the MLEs in (16) and (17); the "mean" and
"variance" columns are $E\hat\theta$ and $\mathrm{var}(\hat\theta)$ for the
parameter named in each row. The variance of the MLE of $\sigma^2$ decreases
in order of $8\sigma^4\sqrt{\gamma}/\sqrt{n}$ and the variance of the MLE of
$\eta^2$ decreases in order of $2\eta^4/n$.
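As a rough arithmetic check of these orders against the table (a worked example added for illustration, using the rows with $\sigma^2 = 1$ and $\gamma = 0.01$, so that $\eta^2 = 0.01$):

\[ \frac{8\sigma^4\sqrt{\gamma}}{\sqrt{n}} = \frac{0.8}{\sqrt{n}} \approx 0.080,\ 0.036,\ 0.025 \quad \text{for } n = 100,\ 500,\ 1000, \]

\[ \frac{2\eta^4}{n} = \frac{2 \times 10^{-4}}{n} = 2.0 \times 10^{-6},\ 4.0 \times 10^{-7},\ 2.0 \times 10^{-7} \quad \text{for } n = 100,\ 500,\ 1000. \]

The corresponding empirical variances in the table are 0.092, 0.049, 0.0257 for $\sigma^2$ and 6.96e-6, 7.23e-7, 3.34e-7 for $\eta^2$. They lie above the leading terms and move closer to them as $n$ grows, which is what one would expect if the remaining gap comes from lower-order terms and from sampling error in 100 replications.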
3      Estimating  the  Variance  Parameter  by  the
Method  of  Moments

Assuming that the observation noises are independent with a finite fourth
moment, I can construct a very simple estimator by the method of moments:

\[ \hat\sigma^2 = \sum_i (X_{i,n}^2 + 2X_{i,n}X_{i-1,n}) - \Big(\sum_i X_{i,n}\Big)^2/n. \tag{23} \]

The second term converges to zero and therefore is negligible as
$n \to \infty$. For simplicity, I assume that $\mu = 0$. The estimator (23)
simplifies to

\[ \hat\sigma_{MM}^2 = \sum_i (X_{i,n}^2 + 2X_{i,n}X_{i-1,n}). \tag{24} \]

This estimator does not require any distributional assumption on the noise.
The noise can be non-stationary. The estimator is nearly unbiased. The
mean of this estimator is

\[ E(\hat\sigma_{MM}^2) = \sum_i (\sigma_{i,n}^2 + \eta_i^2 + \eta_{i-1}^2 - 2\eta_{i-1}^2) = \sigma^2 + \eta_n^2 - \eta_0^2, \tag{25} \]

where $\sigma_{i,n}^2 = \int_{(i-1)/n}^{i/n} \sigma^2(t)\,dt$ and $\eta_i^2$ is
the variance of the observation noise $\epsilon_i$. Unfortunately, this
estimator is not f-consistent if the majority of the noise variances are
non-zero. The variance of the estimator is

Var(a^M)    =    £(2<n  +  4a2nV2  +  Aa2_l<na2n  +  Aa2_hn-q2_x  +  ^i-i,nVi 

+±nhU  +  toLijti  +  toilmU)  +  v{nA)  +  vi4\       (26) 

where $\mu_i^{(4)}$ is the fourth moment of the noise $\epsilon_i$. Suppose
that $\underline{\eta}^2$ is the minimum non-zero value of the $\eta_i^2$,
let $m$ be the number of $\eta_i^2 \ge \underline{\eta}^2$, and let $m$ be
proportional to $n$; then

Var(^M)  >  J>ftk)  >  4  5>.2E^-2/«  =  4m2/nV\ 

If all $\sigma_{i,n}^2 = \sigma^2/n$ and the $\epsilon_i$ are independently
and identically distributed (i.i.d.) with mean zero and variance $\eta^2$,
then

\[ \mathrm{Var}(\hat\sigma_{MM}^2) = 6\sigma^4/n + 16\sigma^2\eta^2 + 8n\eta^4. \tag{27} \]

The optimal observation frequency is $n = \sqrt{3/4}/\gamma$. The minimum
variance of the estimator is

\[ \mathrm{Var}(\hat\sigma_{MM}^2) = 29.86\,\gamma\sigma^4. \]

Beyond this frequency, the estimator gets worse as the observation frequency
goes to infinity. However, because of its simplicity, I want to investigate
whether there is an f-consistent estimator that is comparably simple. In the
next section, I construct an optimal quadratic estimator of the variance
parameter.
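The estimator (24) and the trade-off in (27) are simple enough to sketch directly. The Python code below is a minimal illustration under stated assumptions: the cross term is started at the second increment because $X_{0,n}$ is not defined in the text, $n$ is treated as continuous when minimizing (27), and the function names and parameter values are hypothetical.

```python
import numpy as np

def moment_estimator(x):
    """Estimator (24): sum of X_i^2 + 2 X_i X_{i-1} over the increments.

    The cross term starts at the second increment, an assumption made
    here because X_0 is not defined in the text.
    """
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2) + 2.0 * np.sum(x[1:] * x[:-1]))

def iid_variance(n, sigma2, eta2):
    """Variance formula (27) for constant variance and i.i.d. noise."""
    return 6.0 * sigma2 ** 2 / n + 16.0 * sigma2 * eta2 + 8.0 * n * eta2 ** 2

def optimal_frequency(sigma2, eta2):
    """Minimizer of (27) over n, treating n as continuous."""
    return np.sqrt(3.0 / 4.0) * sigma2 / eta2

sigma2, eta2 = 1.0, 0.01
n_star = optimal_frequency(sigma2, eta2)          # about 86.6
v_star = iid_variance(n_star, sigma2, eta2)       # about 0.2986
print(n_star, v_star)
```

With $\sigma^2 = 1$ and $\gamma = 0.01$ the minimum of (27) is reached near 87 observations, and observing more often than that increases the variance of this estimator.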

4     Quadratic  Estimator 

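In this section, I study estimators of the following quadratic form:

\[ \hat\sigma_Q^2 = X^T Q X, \tag{28} \]

where $X = (X_{1,n}, \ldots, X_{n,n})^T$ and $Q$ is any $n \times n$ matrix. As
in the last section, I assume $\mu = 0$. $\hat\sigma_Q^2$ has mean and
variance:

\[ E\hat\sigma_Q^2 = \mathrm{tr}(Q\Sigma), \tag{29} \]

\[ \mathrm{Var}(\hat\sigma_Q^2) = \mathrm{tr}(Q\Sigma Q\Sigma). \tag{30} \]

Assume that $\sigma_{i,n}^2 = \sigma^2/n$ and $\eta_i^2 = \eta^2$. Then

\[ \Sigma_n = \frac{\sigma^2}{n} I_n + \eta^2 A_n = V_n\,\mathrm{diag}\!\left(\frac{\sigma^2}{n} + \lambda_i\eta^2\right) V_n, \]

where $V_n$ and $\lambda_i$ are defined in (8) and (9). $V_n$ is symmetric,
therefore $V_n^T = V_n$. Let

\[ \tilde Q_n = V_n Q_n V_n, \tag{31} \]

then

\[ E\hat\sigma_Q^2 = \sum_i \tilde q_{ii}\left(\frac{\sigma^2}{n} + \lambda_i\eta^2\right), \tag{32} \]

\[ \mathrm{Var}(\hat\sigma_Q^2) = \sum_{ij} \tilde q_{ij}^2 \left(\frac{\sigma^2}{n} + \lambda_i\eta^2\right)^2 = \sigma^4 \sum_{ij} \tilde q_{ij}^2 \left(\frac{1}{n} + \lambda_i\gamma\right)^2, \tag{33} \]

where $\gamma = \eta^2/\sigma^2$, the signal-to-noise ratio. Obviously,
$\hat\sigma_Q^2$ is unbiased if and only if

\[ \sum_i \tilde q_{ii} = n \quad \text{and} \quad \sum_i \tilde q_{ii}\lambda_i = 0. \tag{34} \]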

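Theorem 2 For any given $r$, the solution of

\[ \min_{\tilde Q} \sum_{ij} \tilde q_{ij}^2 \left(\frac{1}{n} + \lambda_i r\right)^2 \]

\[ \text{subject to} \quad \sum_i \tilde q_{ii} = n \quad \text{and} \quad \sum_i \tilde q_{ii}\lambda_i = 0 \]

is $\tilde q_{ij} = 0$ for $i \ne j$ and

\[ \tilde q_{ii} = \frac{\alpha + \beta\lambda_i}{2(1/n + \lambda_i r)^2}, \tag{35} \]

where $\alpha$ and $\beta$ satisfy

\[ 2n = \alpha \sum_i \frac{1}{(1/n + \lambda_i r)^2} + \beta \sum_i \frac{\lambda_i}{(1/n + \lambda_i r)^2}, \]

\[ 0 = \alpha \sum_i \frac{\lambda_i}{(1/n + \lambda_i r)^2} + \beta \sum_i \frac{\lambda_i^2}{(1/n + \lambda_i r)^2}. \]

The proof is given in the Appendix.

For any given $r$, the best quadratic estimator is

\[ \hat\sigma_Q^2 = \sum_i \tilde q_{ii} Y_i^2, \tag{36} \]

where $\tilde q_{ii}$ is defined in (35).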
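Computing the optimal weights only requires solving a $2 \times 2$ linear system for $\alpha$ and $\beta$. The sketch below is one possible implementation, assuming the closed-form eigen-system of the tridiagonal noise matrix and the rotation $Y = V_n X$; the function names and the values of $n$, $r$, $\sigma^2$ and $\eta^2$ are hypothetical.

```python
import numpy as np

def eigen_system(n):
    i = np.arange(1, n + 1)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * (n + 1))) ** 2
    V = np.sqrt(2.0 / (n + 1)) * np.sin(np.outer(i, i) * np.pi / (n + 1))
    return lam, V

def optimal_weights(n, r):
    """Diagonal weights (35); alpha and beta solve the 2 x 2 system."""
    lam, _ = eigen_system(n)
    w = 1.0 / (1.0 / n + lam * r) ** 2
    M = np.array([[np.sum(w), np.sum(lam * w)],
                  [np.sum(lam * w), np.sum(lam ** 2 * w)]])
    alpha, beta = np.linalg.solve(M, np.array([2.0 * n, 0.0]))
    return (alpha + beta * lam) * w / 2.0

def quadratic_estimator(x, r):
    """Estimator (36): weighted sum of squared rotated increments."""
    x = np.asarray(x, dtype=float)
    _, V = eigen_system(len(x))
    y = V @ x
    return float(np.sum(optimal_weights(len(x), r) * y ** 2))

n, r = 50, 0.1
lam, _ = eigen_system(n)
q = optimal_weights(n, r)
sigma2, eta2 = 1.0, 0.01
expected_value = np.sum(q * (sigma2 / n + lam * eta2))   # equals sigma2 by (34)
print(q.sum(), np.sum(q * lam), expected_value)
```

The two constraints in (34) make the expected value equal to $\sigma^2$ for every value of $\eta^2$, which is why $r$ affects only the variance of the estimator and not its bias.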

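Theorem 3 For the optimal quadratic estimator $\hat\sigma_Q^2$, with $Q$
defined in (31) and (35), the asymptotic convergence rate of the variance is

\[ \mathrm{Var}(\hat\sigma_Q^2) = \sigma^4 \left( 4\sqrt{r} + \frac{2(\gamma - r)}{\sqrt{r}} + \frac{(\gamma - r)^2}{2\sqrt{r^3}} \right) \frac{1}{\sqrt{n}} + o\!\left(\frac{1}{\sqrt{n}}\right). \]

The proof is again given in the Appendix.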

From Theorem 3, for any given $r$, the quadratic estimator $\hat\sigma_Q^2$,
with $Q$ defined in (31) and (35), is f-consistent. The preferred value of
$r$ is $\gamma$, which is unknown. $r$ should be chosen on the order of
$\gamma$, but should not be too small. When $r$ is close to $\gamma$, the
performance of the quadratic estimator is similar to that of the MLE, and it
does not need any distributional assumption about the noise. Otherwise, the
variance does not decrease as the signal-to-noise ratio $\gamma$ decreases.
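The rate in Theorem 3 can be compared with the exact variance expression (33) evaluated at the optimal weights. The following sketch is an illustrative numerical check; it assumes the variance expression $\sigma^4 \sum_i \tilde q_{ii}^2 (1/n + \lambda_i\gamma)^2$ and the leading coefficient $4\sqrt{r} + 2(\gamma - r)/\sqrt{r} + (\gamma - r)^2/(2\sqrt{r^3})$, and the function names and parameter values are hypothetical.

```python
import numpy as np

def estimator_variance(n, r, gamma, sigma2=1.0):
    """Variance expression (33) evaluated at the optimal weights (35)."""
    i = np.arange(1, n + 1)
    lam = 4.0 * np.sin(i * np.pi / (2.0 * (n + 1))) ** 2
    w = 1.0 / (1.0 / n + lam * r) ** 2
    M = np.array([[np.sum(w), np.sum(lam * w)],
                  [np.sum(lam * w), np.sum(lam ** 2 * w)]])
    alpha, beta = np.linalg.solve(M, np.array([2.0 * n, 0.0]))
    q = (alpha + beta * lam) * w / 2.0
    return float(sigma2 ** 2 * np.sum(q ** 2 * (1.0 / n + lam * gamma) ** 2))

def leading_rate(n, r, gamma, sigma2=1.0):
    """Leading term of the variance as stated in Theorem 3."""
    c = (4.0 * np.sqrt(r) + 2.0 * (gamma - r) / np.sqrt(r)
         + (gamma - r) ** 2 / (2.0 * r ** 1.5))
    return float(sigma2 ** 2 * c / np.sqrt(n))

n, gamma = 10_000, 0.01
ratio_matched = estimator_variance(n, gamma, gamma) / leading_rate(n, gamma, gamma)
for r in (0.001, 0.01, 0.1):
    print(r, estimator_variance(n, r, gamma), leading_rate(n, r, gamma))
```

The loop makes the role of $r$ visible: the coefficient is smallest near $r = \gamma$ and grows when $r$ is chosen much smaller than $\gamma$.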

Without the assumption of constant variance, it is very difficult to find
an f-consistent estimator. In the next section, I empirically examine the
sensitivity of both the MLE and the quadratic estimator to the assumptions
of constant variance and Gaussian noise.

5      Non-normal  Noises  and  Unequal  Variances 

In applications to financial markets, neither the assumption of constant
variance nor that of Gaussian noise is valid. The variance of prices changes
over time, especially among high frequency observations. The noise is hardly
Gaussian. In this section, I examine the sensitivity of both the MLE
$\hat\sigma_M^2$ and the quadratic estimator $\hat\sigma_Q^2$ to their
assumptions. The following six time series are simulated and used in
estimating the overall variance:

Series I: $\sigma_{i,n}^2 = \sigma^2/n$ and the noise $\epsilon_i$ is i.i.d.
$\eta\,t(5)$, where $t(5)$ is a t random variable with 5 degrees of freedom.

Series II: $\sigma_{i,n}^2 = \sigma^2/n$ and the noise $\epsilon_i$ is i.i.d.
$\eta\,\mathrm{Bernoulli}(p)$ with $p = 0.5$.

Series III: $\sigma_{i,n}^2$ is sampled from the uniform distribution
$U(0,1)$ and then re-scaled such that $\sum \sigma_{i,n}^2 = \sigma^2$; the
noise $\epsilon_i$ is i.i.d. $\eta\,t(5)$.

Series IV: $\sigma_{i,n}^2$ is sampled from the lognormal distribution
$LN(0,1)$ and then re-scaled such that $\sum \sigma_{i,n}^2 = \sigma^2$; the
noise $\epsilon_i$ is i.i.d. $\eta\,t(5)$.

Series V: $\sigma_{i,n}^2$ is sampled from $\mathrm{Bernoulli}(p)$ with
$p = 0.1$ and then re-scaled such that $\sum \sigma_{i,n}^2 = \sigma^2$; the
noise $\epsilon_i$ is i.i.d. $\eta\,\mathrm{Bernoulli}(p)$ with $p = 0.5$.

Series VI: $\sigma_{i,n}^2 = \sigma^2/n$ and the noise $\epsilon_i$ is MA(1)
with MA coefficient 0.5 and innovations $\eta\,t(5)$.

In the simulation, the following values are used for the parameters:
$\sigma^2 = 1$, $\eta^2 = 0.01$ and $n = 100$, 500 and 1000. The empirical
results are listed in Table 2.
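The series can be generated along the following lines. This is a hedged sketch, not the code behind Table 2: it reads "$\eta\,t(5)$" and "$\eta\,\mathrm{Bernoulli}(p)$" literally as a scale factor times the standard distribution (so the noise variance is not exactly $\eta^2$), takes the Bernoulli noise to have values 0 or $\eta$, and builds the MA(1) noise as $e_i + 0.5\,e_{i-1}$. These readings, the function names, and the seed are assumptions. Series V is omitted because a Bernoulli draw of the interval variances can be all zeros.

```python
import numpy as np

def rescale(raw, sigma2):
    """Rescale raw weights so the interval variances sum to sigma2."""
    raw = np.asarray(raw, dtype=float)
    return sigma2 * raw / raw.sum()

def t5_noise(size, eta, rng):
    """i.i.d. noise eta * t(5); Var(t(5)) = 5/3, so the variance is not eta^2."""
    return eta * rng.standard_t(5, size=size)

def bernoulli_noise(size, eta, rng, p=0.5):
    """i.i.d. noise eta * Bernoulli(p), taking values 0 or eta (an assumption)."""
    return eta * rng.binomial(1, p, size=size)

def ma1_noise(size, eta, rng, theta=0.5):
    """MA(1) noise e_i + theta * e_{i-1} with innovations eta * t(5)."""
    e = t5_noise(size + 1, eta, rng)
    return e[1:] + theta * e[:-1]

def increments(interval_var, noise, rng):
    """X_i = signal increment + eps_i - eps_{i-1}; noise has n + 1 entries."""
    signal = np.sqrt(interval_var) * rng.standard_normal(len(interval_var))
    return signal + np.diff(noise)

rng = np.random.default_rng(0)
n, sigma2, eta = 500, 1.0, 0.1                      # eta^2 = 0.01
equal_var = np.full(n, sigma2 / n)

series_1 = increments(equal_var, t5_noise(n + 1, eta, rng), rng)
series_2 = increments(equal_var, bernoulli_noise(n + 1, eta, rng), rng)
series_3 = increments(rescale(rng.uniform(0, 1, n), sigma2),
                      t5_noise(n + 1, eta, rng), rng)
series_4 = increments(rescale(rng.lognormal(0, 1, n), sigma2),
                      t5_noise(n + 1, eta, rng), rng)
series_6 = increments(equal_var, ma1_noise(n + 1, eta, rng), rng)
```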

The first two series have non-Gaussian noise. For t and Bernoulli random
noise, both $\hat\sigma_M^2$ and $\hat\sigma_Q^2$ show a variance convergence
rate of $1/\sqrt{n}$. The MLE takes advantage of the smaller signal-to-noise
ratio in Series II and has a smaller sample variance of the estimates.

The next three series have unequal variances over time. Again, both
$\hat\sigma_M^2$ and $\hat\sigma_Q^2$ show a variance convergence rate of
$1/\sqrt{n}$. The performances of the two estimators are somewhat similar.
The MLE is slightly better. For small $n$, both estimators slightly
underestimate the variance. The bias disappears as the observation frequency
increases. More variation in $\sigma_{i,n}^2$ causes more bias in both
estimators. However, asymptotically, both estimators are not sensitive to
this deviation from the assumption of equal variance. Many other simulations
have confirmed the above findings.

Series VI has correlated observation noise. The MLE clearly has a
significant bias that does not go away as the observation frequency
increases. However, the quadratic estimator performs much better. The bias,
if any, is small. The variance of $\hat\sigma_Q^2$ converges at a rate of
$1/\sqrt{n}$. For this set of data, the mean squared error of the quadratic
estimator is about the same as for Series I-V. Therefore, the quadratic
estimator has the advantage of not being sensitive to correlation in the
noise. The quadratic estimator, in other cases, can be used as an initial
guess for the MLE. I end this section by giving a summary table (Table 3).

6     Discussion 

A misleading perception is that the more data there is, the better.
Increasing the observation frequency while keeping the time span constant
does not always help parameter estimation. An estimator developed for low
frequency data may not be usable for high frequency data. The observation
noise, which does not decrease as the observation frequency increases, is
the key obstacle. The name observation noise is sometimes misleading in the
financial
Table 2: Sensitivity of $\hat\sigma_M^2$ and $\hat\sigma_Q^2$ to Their Various Assumptions

                            n = 100               n = 500               n = 1000
series       estimator  mean      variance     mean      variance     mean      variance

Series I     MLE        0.955     0.1246       1.008     0.0502       1.027     0.0412
             Q.E.       0.985     0.1862       0.987     0.0797       1.003     0.0573
Series II    MLE        1.009     0.0847       0.984     0.0282       1.002     0.0161
             Q.E.       0.961     0.2528       0.997     0.0938       1.047     0.0519
Series III   MLE        1.018     0.2038       1.001     0.0584       0.990     0.0397
             Q.E.       0.944     0.1288       1.0239    0.104        0.976     0.0535
Series IV    MLE        0.946     0.1681       0.989     0.0560       0.998     0.0383
             Q.E.       0.939     0.1621       0.981     0.0820       0.993     0.061
Series V     MLE        0.930     0.2085       0.972     0.0484       1.006     0.0300
             Q.E.       0.892     0.2614       0.969     0.0757       0.991     0.0692
Series VI    MLE        1.958     0.8852       3.398     2.1119       3.734     1.9708
             Q.E.       1.052     0.2176       1.105     0.0911       1.157     0.0591

For six different types of time series, this table lists empirical
means and variances of the MLE $\hat\sigma_M^2$ and the quadratic estimator
(Q.E.) $\hat\sigma_Q^2$; the "mean" and "variance" columns are
$E\hat\theta$ and $\mathrm{var}(\hat\theta)$. All series use $\sigma^2 = 1$,
$\eta^2 = 0.01$. $r = 0.1$ is used in (36) for $\hat\sigma_Q^2$. The
variance of both estimators has the convergence rate $1/\sqrt{n}$.
Table 3: Summary and Comparison of $\hat\sigma_M^2$ and $\hat\sigma_Q^2$

Is the estimator                     Gaussian                      equal spaced                  correlated
sensitive to:                        noises?                       variances?                    noises?

MLE    Bias                          No                            Yes $\to$ 0                   Significant
       Var($\hat\sigma_M^2$)         $O(\sqrt{\gamma}/\sqrt{n})$   $O(\sqrt{\gamma}/\sqrt{n})$   Not Converge

Q.E.   Bias                          No                            Yes $\to$ 0                   Small
       Var($\hat\sigma_Q^2$)         $O(1/\sqrt{n})$               $O(1/\sqrt{n})$               $O(1/\sqrt{n})$

The summary in this table is based on empirical simulations,
including ones not listed in this paper.


community. Currency spot quotes have widely recognized noise. However, a
stock transaction price recorded precisely may still have so-called
observation noise. The noise is simply the deviation of the price from an
assumed underlying continuous process, and a different name may be
preferable in such applications. The observation noise can include
micro-activities of the market that are not of interest in applications. If
high frequency data are used in studying the macro-activity of the market,
such as daily volatility (the variance of the daily price change), it is
important not to be overwhelmed by micro-activities. The advantage of using
high frequency data to estimate a parameter such as daily volatility is that
we can estimate the volatility day by day rather than as an average. The
MLE developed here has been applied to many financial data sets, such as
currency exchange rates and prices from futures markets. The results are not
listed here because the true volatilities are unknown and no comparisons can
be made. The estimators developed in this paper can be generalized to
estimate a multi-dimensional covariance matrix. Treating micro-activities as
observation noise, one can construct a consistent estimator of the
covariance matrix (Zhou 1994).

The f-consistency and observation noise issues also exist in many other
applications. Many manufacturing processes have been generating high
frequency data series in the last decade. To study this type of data and to
perform parameter estimation, one has to be aware of the observation noise
and examine consistency at high frequency.
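Appendix:


i) Proof of Theorem 1, the asymptotic behavior of the Fisher information
matrix (20):
From (18),

\[ I(\mu,\sigma^2,\eta^2) = \begin{pmatrix} \sum_{ij}[\Sigma_n^{-1}]_{ij}/n^2 & 0 & 0 \\ 0 & \sum_i \frac{1}{2n^2(\sigma^2/n + \eta^2\lambda_i)^2} & \sum_i \frac{\lambda_i}{2n(\sigma^2/n + \eta^2\lambda_i)^2} \\ 0 & \sum_i \frac{\lambda_i}{2n(\sigma^2/n + \eta^2\lambda_i)^2} & \sum_i \frac{\lambda_i^2}{2(\sigma^2/n + \eta^2\lambda_i)^2} \end{pmatrix}. \]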

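First, I prove that the (2,2)-th element of the matrix satisfies

\[ \sum_i \frac{1}{2n^2(\sigma^2/n + \eta^2\lambda_i)^2} = \frac{\sqrt{n}}{8\sigma^4\sqrt{\gamma}} + o(\sqrt{n}). \]

Recall that $\lambda_i = 4\sin^2\!\left(\frac{i\pi}{2(n+1)}\right)$. We can
easily prove that

\[ \frac{1}{(1/n + 4\gamma\sin^2(\pi x/2))^2}, \quad \gamma > 0, \]

is a decreasing function of $x \in [0,1]$ and

\[ \int_{\frac{1}{n+1}}^{1} \frac{dx}{(1/n + 4\gamma\sin^2(\pi x/2))^2} \le \frac{1}{n+1} \sum_{x = \frac{1}{n+1}, \ldots, \frac{n}{n+1}} \frac{1}{(1/n + 4\gamma\sin^2(\pi x/2))^2} \le \int_0^{\frac{n}{n+1}} \frac{dx}{(1/n + 4\gamma\sin^2(\pi x/2))^2}. \tag{37} \]

The maximum value of $1/(1/n + 4\gamma\sin^2(\pi x/2))^2$ over $[0,1]$ is
$n^2$. Therefore

\[ \frac{1}{n+1} \sum_{x = \frac{1}{n+1}, \ldots, \frac{n}{n+1}} \frac{1}{(1/n + 4\gamma\sin^2(\pi x/2))^2} = \int_0^1 \frac{dx}{(\frac{1}{n} + 4\gamma\sin^2(\frac{\pi}{2}x))^2} + O(n) \]

\[ = \left[ \frac{(\frac{2}{n} + 4\gamma)}{\frac{1}{n}(\frac{1}{n} + 4\gamma)\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}\,\pi} \mathrm{Arctan}\!\left( \frac{(\frac{1}{n} + 4\gamma)\tan(\frac{\pi}{2}x)}{\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}} \right) + \frac{4\gamma\sin(\pi x)}{\frac{1}{n}(\frac{1}{n} + 4\gamma)\,\pi\,(\frac{2}{n} + 4\gamma - 4\gamma\cos(\pi x))} \right]_{x=0}^{x=1} + O(n) \]

\[ = \frac{\frac{2}{n} + 4\gamma}{2\,\frac{1}{n}(\frac{1}{n} + 4\gamma)\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}} + O(n) = \frac{n^{3/2}}{4\sqrt{\gamma}} + o(n^{3/2}). \]

Therefore

\[ \sum_i \frac{1}{2n^2(\sigma^2/n + \eta^2\lambda_i)^2} = \frac{n+1}{2n^2\sigma^4} \left( \frac{n^{3/2}}{4\sqrt{\gamma}} + o(n^{3/2}) \right) = \frac{\sqrt{n}}{8\sigma^4\sqrt{\gamma}} + o(\sqrt{n}). \]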


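Next, I prove that the (3,3)-th element of the matrix satisfies

\[ \sum_i \frac{\lambda_i^2}{2(\sigma^2/n + \eta^2\lambda_i)^2} = \frac{n}{2\eta^4} + o(n). \]

It is easy to see that $x^2/(1/n + \gamma x)^2$ is an increasing function of
$x$. Therefore,

\[ \frac{\sin^4(\pi x/2)}{(1/n + 4\gamma\sin^2(\pi x/2))^2}, \quad \gamma > 0, \]

is an increasing function of $x \in [0,1]$. The value of the above function
is bounded by $1/(4\gamma)^2$. Using the same technique as above,

\[ \frac{1}{n+1} \sum_{x = \frac{1}{n+1}, \ldots, \frac{n}{n+1}} \frac{\sin^4(\pi x/2)}{(1/n + 4\gamma\sin^2(\pi x/2))^2} = \int_0^1 \frac{\sin^4(\frac{\pi}{2}x)}{(\frac{1}{n} + 4\gamma\sin^2(\frac{\pi}{2}x))^2}\,dx + O\!\left(\frac{1}{n}\right) \]

\[ = \frac{1}{16\gamma^2} \left[ x - \frac{\frac{1}{n}(\frac{2}{n} + 12\gamma)}{(\frac{1}{n} + 4\gamma)\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}\,\pi} \mathrm{Arctan}\!\left( \frac{(\frac{1}{n} + 4\gamma)\tan(\frac{\pi}{2}x)}{\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}} \right) + \frac{\frac{4\gamma}{n}\sin(\pi x)}{(\frac{1}{n} + 4\gamma)\,\pi\,(\frac{2}{n} + 4\gamma - 4\gamma\cos(\pi x))} \right]_{x=0}^{x=1} + O\!\left(\frac{1}{n}\right) \]

\[ = \frac{1}{16\gamma^2} \left( 1 - \frac{\frac{1}{n}(\frac{2}{n} + 12\gamma)}{2(\frac{1}{n} + 4\gamma)\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}} \right) + O\!\left(\frac{1}{n}\right) = \frac{1}{16\gamma^2} + o(1). \]

Therefore

\[ \sum_i \frac{\lambda_i^2}{2(\sigma^2/n + \eta^2\lambda_i)^2} = \frac{16(n+1)}{2\sigma^4} \left( \frac{1}{16\gamma^2} + o(1) \right) = \frac{n}{2\eta^4} + o(n). \]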

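It is slightly tricky to prove the asymptotic result for the (2,3)-th
element

\[ \sum_i \frac{\lambda_i}{2n(\sigma^2/n + \eta^2\lambda_i)^2} = \frac{\sqrt{n}}{8\sigma^4\sqrt{\gamma^3}} + o(\sqrt{n}). \]

The function

\[ \frac{\sin^2(\pi x/2)}{(1/n + 4\gamma\sin^2(\pi x/2))^2} \]

is not monotone. However, it is positive and has at most one turning point.
The maximum value of the function is $\frac{n}{16\gamma}$. A similar
technique can be used here:

\[ \frac{1}{n+1} \sum_{x = \frac{1}{n+1}, \ldots, \frac{n}{n+1}} \frac{\sin^2(\pi x/2)}{(1/n + 4\gamma\sin^2(\pi x/2))^2} = \int_0^1 \frac{\sin^2(\frac{\pi}{2}x)}{(\frac{1}{n} + 4\gamma\sin^2(\frac{\pi}{2}x))^2}\,dx + O(1) \]

\[ = \left[ \frac{1}{(\frac{1}{n} + 4\gamma)\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}\,\pi} \mathrm{Arctan}\!\left( \frac{(\frac{1}{n} + 4\gamma)\tan(\frac{\pi}{2}x)}{\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}} \right) + \frac{\sin(\pi x)}{(\frac{1}{n} + 4\gamma)\,\pi\,(-\frac{2}{n} - 4\gamma + 4\gamma\cos(\pi x))} \right]_{x=0}^{x=1} + O(1) \]

\[ = \frac{1}{2(\frac{1}{n} + 4\gamma)\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}} + O(1) = \frac{\sqrt{n}}{16\sqrt{\gamma^3}} + o(\sqrt{n}). \]

Therefore

\[ \sum_i \frac{\lambda_i}{2n(\sigma^2/n + \eta^2\lambda_i)^2} = \frac{4(n+1)}{2n\sigma^4} \left( \frac{\sqrt{n}}{16\sqrt{\gamma^3}} + o(\sqrt{n}) \right) = \frac{\sqrt{n}}{8\sigma^4\sqrt{\gamma^3}} + o(\sqrt{n}). \]

Now, I come back to the (1,1)-th element

\[ \sum_{ij}[\Sigma_n^{-1}]_{ij}/n^2 = \frac{1}{2\sigma^2\sqrt{\gamma n}} + o\!\left(\frac{1}{\sqrt{n}}\right). \]

First,

\[ \Sigma_n^{-1} = V_n\,\mathrm{diag}\!\left(1/(\sigma^2/n + \eta^2\lambda_i)\right) V_n. \]

Therefore

\[ \sum_{ij}[\Sigma_n^{-1}]_{ij} = \sum_i \left( v_{\cdot i}^2/(\sigma^2/n + \eta^2\lambda_i) \right). \]

Since

\[ \sum_j v_{ij}^2 = \frac{2}{n+1} \sum_j \sin^2(ij\pi/(n+1)) = 1 - \frac{1}{n+1}\,\frac{\cos(in\pi/(n+1))\sin(i\pi)}{\sin(i\pi/(n+1))} = 1, \]

\[ \sum_{ij}[\Sigma_n^{-1}]_{ij}/n^2 = \sum_i \frac{1}{n^2(\sigma^2/n + \eta^2\lambda_i)}. \]

It is easy to argue that

\[ \sum_i \frac{1}{n^2(\sigma^2/n + \eta^2\lambda_i)} = \frac{n+1}{n^2\sigma^2} \int_0^1 \frac{dx}{\frac{1}{n} + 4\gamma\sin^2(\frac{\pi}{2}x)} + O(1/n) \]

\[ = \frac{n+1}{n^2\sigma^2} \left[ \frac{2}{\pi\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}} \mathrm{Arctan}\!\left( \frac{(\frac{1}{n} + 4\gamma)\tan(\frac{\pi}{2}x)}{\sqrt{\frac{1}{n^2} + \frac{4\gamma}{n}}} \right) \right]_{x=0}^{x=1} + O(1/n) \]

\[ = \frac{1}{n\sigma^2\sqrt{\frac{4\gamma}{n}}} + o\!\left(\frac{1}{\sqrt{n}}\right) = \frac{1}{2\sigma^2\sqrt{\gamma n}} + o\!\left(\frac{1}{\sqrt{n}}\right). \]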
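ii) Proof of Theorem 2, the optimal matrix Q for the quadratic estimator:
For the optimization problem

\[ \min_{\tilde Q} \sum_{ij} \tilde q_{ij}^2 \left(\frac{1}{n} + \lambda_i r\right)^2 \]

\[ \text{subject to} \quad \sum_i \tilde q_{ii} = n \quad \text{and} \quad \sum_i \tilde q_{ii}\lambda_i = 0, \]

I define the Lagrangian function

\[ \Phi(\tilde Q; \alpha, \beta) = \sum_{ij} \tilde q_{ij}^2 \left(\frac{1}{n} + \lambda_i r\right)^2 - \alpha\Big(\sum_i \tilde q_{ii} - n\Big) - \beta\Big(\sum_i \tilde q_{ii}\lambda_i\Big). \]

Obviously, $\tilde q_{ij} = 0$ for $i \ne j$. For $i = j$, differentiate
$\Phi$ with respect to each $\tilde q_{ii}$:

\[ 0 = 2\tilde q_{ii}\left(\frac{1}{n} + \lambda_i r\right)^2 - \alpha - \beta\lambda_i. \]

The conditions

\[ \sum_i \tilde q_{ii} = n \quad \text{and} \quad \sum_i \tilde q_{ii}\lambda_i = 0 \]

lead to the following equations:

\[ 2n = \alpha \sum_i \frac{1}{(1/n + \lambda_i r)^2} + \beta \sum_i \frac{\lambda_i}{(1/n + \lambda_i r)^2}, \]

\[ 0 = \alpha \sum_i \frac{\lambda_i}{(1/n + \lambda_i r)^2} + \beta \sum_i \frac{\lambda_i^2}{(1/n + \lambda_i r)^2}, \]

and

\[ \tilde q_{ii} = \frac{\alpha + \beta\lambda_i}{2(1/n + \lambda_i r)^2}. \]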


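iii) Proof of Theorem 3, the convergence rate of the quadratic estimator
$\hat\sigma_Q^2$.
From Theorem 1,

\[ \sum_i \frac{1}{(1/n + \lambda_i r)^2} = \frac{n^{5/2}}{4\sqrt{r}} + o(n^{5/2}), \]

\[ \sum_i \frac{\lambda_i}{(1/n + \lambda_i r)^2} = \frac{n^{3/2}}{4r^{3/2}} + o(n^{3/2}), \]

\[ \sum_i \frac{\lambda_i^2}{(1/n + \lambda_i r)^2} = \frac{n}{r^2} + o(n). \]

Therefore,

\[ \alpha = (2n)\left(\frac{n}{r^2} + o(n)\right) \Big/ \left(\frac{n^{7/2}}{4r^{5/2}} + o(n^{7/2})\right) = \frac{8\sqrt{r}}{n^{3/2}} + o(n^{-3/2}), \]

\[ \beta = (2n)\left(-\frac{n^{3/2}}{4r^{3/2}} + o(n^{3/2})\right) \Big/ \left(\frac{n^{7/2}}{4r^{5/2}} + o(n^{7/2})\right) = -\frac{2r}{n} + o\!\left(\frac{1}{n}\right). \]

Rewrite the variance as

\[ \mathrm{Var}(\hat\sigma_Q^2) = \sigma^4 \sum_i \tilde q_{ii}^2 \left(\frac{1}{n} + \lambda_i\gamma\right)^2 \]

\[ = \sigma^4 \sum_i \tilde q_{ii}^2 \left[ \left(\frac{1}{n} + \lambda_i r\right)^2 + 2(\gamma - r)\lambda_i\left(\frac{1}{n} + \lambda_i r\right) + (\gamma - r)^2\lambda_i^2 \right]. \]

It is easy to check that the first term is

\[ \sum_i \tilde q_{ii}^2 \left(\frac{1}{n} + \lambda_i r\right)^2 = \sum_i \frac{(\alpha + \beta\lambda_i)^2}{4(1/n + \lambda_i r)^2} = \frac{4\sqrt{r}}{\sqrt{n}} + o\!\left(\frac{1}{\sqrt{n}}\right). \]

The other two terms can be proved by similar techniques. It is a long and
tedious calculus manipulation. Numerically, one can easily verify the
following equations:

\[ \sum_i \tilde q_{ii}^2 \lambda_i \left(\frac{1}{n} + \lambda_i r\right) = \frac{1}{\sqrt{r}\sqrt{n}} + o\!\left(\frac{1}{\sqrt{n}}\right), \]

\[ \sum_i \tilde q_{ii}^2 \lambda_i^2 = \frac{1}{2\sqrt{r^3}\sqrt{n}} + o\!\left(\frac{1}{\sqrt{n}}\right). \]

Therefore

\[ \mathrm{Var}(\hat\sigma_Q^2) = \sigma^4 \sum_i \tilde q_{ii}^2 \left(\frac{1}{n} + \lambda_i\gamma\right)^2 = \sigma^4 \left( 4\sqrt{r} + \frac{2(\gamma - r)}{\sqrt{r}} + \frac{(\gamma - r)^2}{2\sqrt{r^3}} \right) \frac{1}{\sqrt{n}} + o\!\left(\frac{1}{\sqrt{n}}\right). \]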